interplay within peptides.
Biological question: protease cleavage pattern discovery
The data used in either protease cleavage pattern discovery or
post-translational modification pattern discovery are peptides. A peptide is
a subsequence cut from a long sequence, which is normally between a
dozen of residues long. There are two key problems of peptide
pattern discovery: the polyprotein cleavage activity caused by
proteases and the post-translational modification caused by
chemicals. Many diverse functions within proteins are due to the wide
distribution of these proteases and chemicals in nature. Both subjects
have been extensively researched in protein science. For instance, the
studies cover many areas such as protein chemistry, proteomics,
pharmaceutical manufacture, peptide mass fingerprinting, protein
ation, protein domain separation and protease activity recognition
s, 1993].
A peptide used for protease cleavage pattern discovery is
commonly expressed as $R_n \cdots R_2 R_1 R_1' R_2' \cdots R_m'$, where
$R_i$ ($1 \le i \le n$) stands for one of the N-terminal residues and
$R_i'$ ($1 \le i \le m$) stands for one of the C-terminal residues.
The cleavage happens between $R_1$ and $R_1'$. A peptide used for
post-translational modification pattern discovery is commonly expressed
as $R_n \cdots R_2 R_1 R R_1' R_2' \cdots R_m'$, where $R$ is the modification site. Most
machine learning models constructed for peptide pattern discovery treat
residues as mutually independent variables. For instance, the binary
encoding approach encodes each residue using a binary vector, and the
bio-basis function introduced in Chapter 3 of this book employs a mutation
matrix to align two peptides so as to encode peptides. The alignment
score between two peptides is a linear sum of the mutation probabilities
between the residues of the two peptides, taken pairwise. Therefore,
potential residue interplay has not been considered in the modelling
process of peptide pattern analysis using these encoding approaches.
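The independence assumption can be made concrete with a minimal sketch. It assumes the 20 standard amino acids in an arbitrary fixed order, and it uses a toy identity score as a stand-in for a real mutation matrix. The names `binary_encode`, `additive_score` and `toy_pair_score` are hypothetical and chosen for illustration only; this is not the book's implementation of the bio-basis function.

```python
# Assumption: the 20 standard amino acids, in an arbitrary fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_encode(peptide):
    """Concatenate one 20-bit indicator vector per residue."""
    vector = []
    for residue in peptide:
        bits = [0] * len(AMINO_ACIDS)
        # str.index raises ValueError for a non-standard residue letter.
        bits[AMINO_ACIDS.index(residue)] = 1
        vector.extend(bits)
    return vector


def toy_pair_score(a, b):
    """Hypothetical stand-in for a mutation matrix entry: 1 if identical."""
    return 1.0 if a == b else 0.0


def additive_score(peptide_a, peptide_b, pair_score=toy_pair_score):
    """Sum a per-position score over aligned residue pairs.

    Each term depends on one position only, so no term can express
    an interaction between two different positions.
    """
    if len(peptide_a) != len(peptide_b):
        raise ValueError("peptides must have the same length")
    return sum(pair_score(a, b) for a, b in zip(peptide_a, peptide_b))


def dot(u, v):
    """Inner product of two equal-length numeric vectors."""
    return sum(x * y for x, y in zip(u, v))


# Two arbitrary example octapeptides differing at one position.
peptide_a = "SQNYPIVQ"
peptide_b = "SQNFPIVQ"
score = additive_score(peptide_a, peptide_b)
inner = dot(binary_encode(peptide_a), binary_encode(peptide_b))
```

With the identity score, the inner product of two binary encodings equals the additive score: both simply count matching positions. In either form, the contribution of one position does not depend on which residues occupy the others.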
In spite of the assumption of mutual independence between residues in
machine learning models, the research into residue interplay has